57 research outputs found

    k-SLAM: Accurate and ultra-fast taxonomic classification and gene identification for large metagenomic datasets

    k-SLAM is a highly efficient algorithm for the characterisation of metagenomic data. Unlike other ultra-fast metagenomic classifiers, full sequence alignment is performed, allowing for gene identification and variant calling in addition to accurate taxonomic classification. A k-mer based method provides greater taxonomic accuracy than other classifiers and a three orders of magnitude speed increase over alignment-based approaches. The use of alignments to find variants and genes along with their taxonomic origins enables novel strains to be characterised. k-SLAM's speed allows a full taxonomic classification and gene identification to be tractable on modern large datasets. A pseudo-assembly method is used to increase classification accuracy by up to 40% for species which have high sequence homology within their genus

    elPrep: high-performance preparation of sequence alignment/map files for variant calling

    elPrep is a high-performance tool for preparing sequence alignment/map files for variant calling in sequencing pipelines. It can be used as a replacement for SAMtools and Picard for preparation steps such as filtering, sorting, marking duplicates, reordering contigs, and so on, while producing identical results. What sets elPrep apart is its software architecture that allows executing preparation pipelines by making only a single pass through the data, no matter how many preparation steps are used in the pipeline. elPrep is designed as a multithreaded application that runs entirely in memory, avoids repeated file I/O, and merges the computation of several preparation steps to significantly speed up the execution time. For example, for a preparation pipeline of five steps on a whole-exome BAM file (NA12878), we reduce the execution time from about 1:40 hours, when using a combination of SAMtools and Picard, to about 15 minutes when using elPrep, while utilising the same server resources, here 48 threads and 23 GB of RAM. For the same pipeline on whole-genome data (NA12878), elPrep reduces the runtime from 24 hours to less than 5 hours. As a typical clinical study may contain sequencing data for hundreds of patients, elPrep can remove several hundreds of hours of computing time, and thus substantially reduce analysis time and cost

    Whole genome sequencing reveals a 7 base-pair deletion in DMD exon 42 in a dog with muscular dystrophy

    Dystrophin is a key cytoskeletal protein coded by the Duchenne muscular dystrophy (DMD) gene located on the X-chromosome. Truncating mutations in the DMD gene cause loss of dystrophin and the classical DMD clinical syndrome. Spontaneous DMD gene mutations and associated phenotypes occur in several other species. The mdx mouse model and the golden retriever muscular dystrophy (GRMD) canine model have been used extensively to study DMD disease pathogenesis and show efficacy and side effects of putative treatments. Certain DMD gene mutations in high-risk, so-called hot spot, areas can be particularly helpful in modeling molecular therapies. Identification of specific mutations has been greatly enhanced by new genomic methods. Whole genome, next generation sequencing (WGS) has been recently used to define DMD patient mutations, but has not been used in dystrophic dogs. A dystrophin-deficient Cavalier King Charles Spaniel (CKCS) dog was evaluated at the functional, histopathological, biochemical, and molecular level. The affected dog’s phenotype was compared to the previously reported canine dystrophinopathies. WGS was then used to detect a 7 base pair deletion in DMD exon 42 (c.6051-6057delTCTCAAT mRNA), predicting a frameshift in gene transcription and truncation of dystrophin protein translation. The deletion was confirmed with conventional PCR and Sanger sequencing. This mutation is in a secondary DMD gene hotspot area distinct from the one identified earlier at the 5′ donor splice site of intron 50 in the CKCS breed. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00335-016-9675-2) contains supplementary material, which is available to authorized users

    Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans

    The Southern African Human Genome Programme is a national initiative that aspires to unlock the unique genetic character of southern African populations for a better understanding of human genetic diversity. In this pilot study the Southern African Human Genome Programme characterizes the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole-genome sequencing. A total of ~16 million unique variants are identified. Despite the shallow time depth since divergence between the two main southeastern Bantu-speaking groups (Nguni and Sotho-Tswana), principal component analysis and structure analysis reveal significant (p < 10⁻⁶) differentiation, and FST analysis identifies regions with high divergence. The Coloured individuals show evidence of varying proportions of admixture with Khoesan, Bantu-speakers, Europeans, and populations from the Indian sub-continent. Whole-genome sequencing data reveal extensive genomic diversity, increasing our understanding of the complex and region-specific history of African populations and highlighting its potential impact on biomedical research and genetic susceptibility to disease

    NEK1 variants confer susceptibility to amyotrophic lateral sclerosis

    To identify genetic factors contributing to amyotrophic lateral sclerosis (ALS), we conducted whole-exome analyses of 1,022 index familial ALS (FALS) cases and 7,315 controls. In a new screening strategy, we performed gene-burden analyses trained with established ALS genes and identified a significant association between loss-of-function (LOF) NEK1 variants and FALS risk. Independently, autozygosity mapping for an isolated community in the Netherlands identified a NEK1 p.Arg261His variant as a candidate risk factor. Replication analyses of sporadic ALS (SALS) cases and independent control cohorts confirmed significant disease association for both p.Arg261His (10,589 samples analyzed) and NEK1 LOF variants (3,362 samples analyzed). In total, we observed NEK1 risk variants in nearly 3% of ALS cases. NEK1 has been linked to several cellular functions, including cilia formation, DNA-damage response, microtubule stability, neuronal morphology and axonal polarity. Our results provide new and important insights into ALS etiopathogenesis and genetic etiology

    Mutations in the histone methyltransferase gene KMT2B cause complex early-onset dystonia

    Histone lysine methylation, mediated by mixed-lineage leukemia (MLL) proteins, is now known to be critical in the regulation of gene expression, genomic stability, cell cycle and nuclear architecture. Despite MLL proteins being postulated as essential for normal development, little is known about the specific functions of the different MLL lysine methyltransferases. Here we report heterozygous variants in the gene KMT2B (also known as MLL4) in 27 unrelated individuals with a complex progressive childhood-onset dystonia, often associated with a typical facial appearance and characteristic brain magnetic resonance imaging findings. Over time, the majority of affected individuals developed prominent cervical, cranial and laryngeal dystonia. Marked clinical benefit, including the restoration of independent ambulation in some cases, was observed following deep brain stimulation (DBS). These findings highlight a clinically recognizable and potentially treatable form of genetic dystonia, demonstrating the crucial role of KMT2B in the physiological control of voluntary movement. Funding for the project was provided by the Wellcome Trust for UK10K (WT091310) and DDD Study. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003] - see www.ddduk.org/access.html for full acknowledgement. This work was supported in part by the Intramural Research Program of the National Human Genome Research Institute and the Common Fund, NIH Office of the Director. This work was supported in part by the German Ministry of Research and Education (grant nos. 01GS08160 and 01GS08167; German Mental Retardation Network) as part of the National Genome Research Network to A.R. and D.W. and by the Deutsche Forschungsgemeinschaft (AB393/2-2) to A.R. Brain expression data was provided by the UK Human Brain Expression Consortium (UKBEC), which comprises John A. Hardy, Mina Ryten, Michael Weale, Daniah Trabzuni, Adaikalavan Ramasamy, Colin Smith and Robert Walker, affiliated with UCL Institute of Neurology (J.H., M.R., D.T.), King’s College London (M.R., M.W., A.R.) and the University of Edinburgh (C.S., R.W.)

    An Interest Filtering Mechanism Based on LoI
